Goto

Collaborating Authors

 private inference


PrivCirNet: Efficient Private Inference via Block Circulant Transformation

Neural Information Processing Systems

Homomorphic encryption (HE)-based deep neural network (DNN) inference protects data and model privacy but suffers from significant computation overhead. We observe transforming the DNN weights into circulant matrices converts general matrix-vector multiplications into HE-friendly 1-dimensional convolutions, drastically reducing the HE computation cost.


CryptoNAS: Private Inference on a ReLU Budget

Neural Information Processing Systems

Machine learning as a service has given raise to privacy concerns surrounding clients' data and providers' models and has catalyzed research in private inference (PI): methods to process inferences without disclosing inputs. Recently, researchers have adapted cryptographic techniques to show PI is possible, however all solutions increase inference latency beyond practical limits. This paper makes the observation that existing models are ill-suited for PI and proposes a novel NAS method, named CryptoNAS, for finding and tailoring models to the needs of PI. The key insight is that in PI operator latency cost are inverted: non-linear operations (e.g., ReLU) dominate latency, while linear layers become effectively free. We develop the idea of a ReLU budget as a proxy for inference latency and use CryptoNAS to build models that maximize accuracy within a given budget. CryptoNAS improves accuracy by 3.4% and latency by 2.4x over the state-of-the-art.


Iron: Private Inference on Transformers

Neural Information Processing Systems

We initiate the study of private inference on Transformer-based models in the client-server setting, where clients have private inputs and servers hold proprietary models. Our main contribution is to provide several new secure protocols for matrix multiplication and complex non-linear functions like Softmax, GELU activations, and LayerNorm, which are critical components of Transformers. Specifically, we first propose a customized homomorphic encryption-based protocol for matrix multiplication that crucially relies on a novel compact packing technique. This design achieves $\sqrt{m} \times$ less communication ($m$ is the number of rows of the output matrix) over the most efficient work. Second, we design efficient protocols for three non-linear functions via integrating advanced underlying protocols and specialized optimizations. Compared to the state-of-the-art protocols, our recipes reduce about half of the communication and computation overhead. Furthermore, all protocols are numerically precise, which preserve the model accuracy of plaintext. These techniques together allow us to implement \Name, an efficient Transformer-based private inference framework. Experiments conducted on several real-world datasets and models demonstrate that \Name achieves $3 \sim 14\times$ less communication and $3 \sim 11\times$ less runtime compared to the prior art.


A Related Works 318 Private inference has been a promising solution to protect both data and model privacy during deep

Neural Information Processing Systems

Private inference framework CoPriv adopts CypTFlow2 [39] protocol for private inference. MBps bandwidth and 80ms echo latency. We train our proposed CoPriv with self-distillation. For the network re-parameterization mentioned in Section 4.2, here we provide the For some input size, the input cannot be covered by tiles. The correctness and equivalence can be proved with Eq. 1. Also, [ Winograd convolution for stride of 2. The algorithm can be nested with itself to obtain a 2D algorithm The correctness analysis is the same with Section D.3.



PrivDFS: Private Inference via Distributed Feature Sharing against Data Reconstruction Attacks

Liu, Zihan, Wen, Jiayi, Wu, Junru, Zou, Xuyang, Tan, Shouhong, Zheng, Zhirun, Huang, Cheng

arXiv.org Artificial Intelligence

In this paper, we introduce PrivDFS, a distributed feature-sharing framework for input-private inference in image classification. A single holistic intermediate representation in split inference gives diffusion-based Data Reconstruction Attacks (DRAs) sufficient signal to reconstruct the input with high fidelity. PrivDFS restructures this vulnerability by fragmenting the representation and processing the fragments independently across a majority-honest set of servers. As a result, each branch observes only an incomplete and reconstruction-insufficient view of the input. To realize this, PrivDFS employs learnable binary masks that partition the intermediate representation into sparse and largely non-overlapping feature shares, each processed by a separate server, while a lightweight fusion module aggregates their predictions on the client. This design preserves full task accuracy when all branches are combined, yet sharply limits the reconstructive power available to any individual server. PrivDFS applies seamlessly to both ResNet-based CNNs and Vision Transformers. Across CIFAR-10/100, CelebA, and ImageNet-1K, PrivDFS induces a pronounced collapse in DRA performance, e.g., on CIFAR-10, PSNR drops from 23.25 -> 12.72 and SSIM from 0.963 -> 0.260, while maintaining accuracy within 1% of non-private split inference. These results establish structural feature partitioning as a practical and architecture-agnostic approach to reducing reconstructive leakage in cloud-based vision inference.


PrivCirNet: Efficient Private Inference via Block Circulant Transformation

Neural Information Processing Systems

Homomorphic encryption (HE)-based deep neural network (DNN) inference protects data and model privacy but suffers from significant computation overhead. We observe transforming the DNN weights into circulant matrices converts general matrix-vector multiplications into HE-friendly 1-dimensional convolutions, drastically reducing the HE computation cost.




N-Parties Private Structure and Parameter Learning for Sum-Product Networks

Heilmann, Xenia, Althaus, Ernst, Cerrato, Mattia, Rassau, Nick Johannes Peter, Dousti, Mohammad Sadeq, Kramer, Stefan

arXiv.org Artificial Intelligence

A sum-product network (SPN) is a graphical model that allows several types of probabilistic inference to be performed efficiently. In this paper, we propose a privacy-preserving protocol which tackles structure generation and parameter learning of SPNs. Additionally, we provide a protocol for private inference on SPNs, subsequent to training. To preserve the privacy of the participants, we derive our protocol based on secret sharing, which guarantees privacy in the honest-but-curious setting even when at most half of the parties cooperate to disclose the data. The protocol makes use of a forest of randomly generated SPNs, which is trained and weighted privately and can then be used for private inference on data points. Our experiments indicate that preserving the privacy of all participants does not decrease log-likelihood performance on both homogeneously and heterogeneously partitioned data. We furthermore show that our protocol's performance is comparable to current state-of-the-art SPN learners in homogeneously partitioned data settings. In terms of runtime and memory usage, we demonstrate that our implementation scales well when increasing the number of parties, comparing favorably to protocols for neural networks, when they are trained to reproduce the input-output behavior of SPNs.